Browsing NTNU Open by author "Jongman, Allard"

Showing results 1-2 of 2

Mouth2Audio: intelligible audio synthesis from videos with distinctive vowel articulation

Garg, Saurabh; Ruan, Haoyao; Hamarneh, Ghassan; Behne, Dawn Marie; Jongman, Allard; Sereno, Joan; Wang, Yue (Journal article, 2023)

Humans use both auditory and facial cues to perceive speech, especially when auditory input is degraded, indicating a direct association between visual articulatory and acoustic speech information. This study investigates ...

Plain-to-clear speech video conversion for enhanced intelligibility

Sachdeva, Shubam; Ruan, Haoyao; Hamarneh, Ghassan; Behne, Dawn Marie; Jongman, Allard; Sereno, Joan; Wang, Yue (Peer reviewed; Journal article, 2023)

Clearly articulated speech, relative to plain-style speech, has been shown to improve intelligibility. We examine if visible speech cues in video only can be systematically modified to enhance clear-speech visual features ...